Bagging Predictors
An Investigation of Sensitivity on Bagging Predictors: An Empirical Approach
Liang, Guohua (University of Technology, Sydney)
As growing numbers of real-world applications involve imbalanced class distributions or unequal misclassification costs across classes, learning from imbalanced class distributions is considered one of the most challenging issues in data mining research. This study empirically investigates the sensitivity of bagging predictors, using 12 learning algorithms and 9 levels of class distribution on 14 imbalanced data-sets, and applies statistical and graphical methods to understand the effect of varying levels of class distribution on bagging predictors. The experimental results demonstrate that bagging predictors built with naïve Bayes (NB) and multi-layer perceptron (MLP) are insensitive to varying levels of imbalanced class distribution.
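The following is a minimal sketch, not the paper's actual protocol, of how such a sensitivity probe might look in scikit-learn: one data set is undersampled to a range of minority-class proportions (loosely mirroring the abstract's "9 levels"), and a bagging predictor's AUC is tracked at each level. The helper `resample_to_ratio` and all parameter values are illustrative assumptions.

```python
# Sketch: probe a bagging predictor's sensitivity to class distribution
# by resampling one data set to several minority-class proportions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=4000, random_state=0)  # balanced source

def resample_to_ratio(X, y, minority_ratio, rng):
    """Undersample class 1 so it makes up `minority_ratio` of the result."""
    majority = np.flatnonzero(y == 0)
    minority = np.flatnonzero(y == 1)
    n_min = int(len(majority) * minority_ratio / (1 - minority_ratio))
    keep = np.concatenate([majority, rng.choice(minority, n_min, replace=False)])
    return X[keep], y[keep]

# Nine distribution levels, from heavily imbalanced to near-balanced.
for ratio in np.linspace(0.05, 0.45, 9):
    Xr, yr = resample_to_ratio(X, y, ratio, rng)
    bag = BaggingClassifier(GaussianNB(), n_estimators=50, random_state=0)
    auc = cross_val_score(bag, Xr, yr, cv=5, scoring="roc_auc").mean()
    print(f"minority ratio {ratio:.2f}: AUC {auc:.3f}")
```

A roughly flat AUC curve across the ratios would correspond to the abstract's notion of insensitivity; a steep drop at low ratios would mark a sensitive predictor.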
An Empirical Study of Bagging Predictors for Different Learning Algorithms
Liang, Guohua (University of Technology, Sydney) | Zhu, Xingquan (University of Technology, Sydney) | Zhang, Chengqi (University of Technology, Sydney)
Bagging is a simple yet effective design that combines multiple single learners into an ensemble for prediction. Despite its popular usage in many real-world applications, existing research mainly regards unstable learners as the key to ensuring the performance gain of a bagging predictor, leaving many key factors unclear. For example, it is not clear when a bagging predictor can outperform a single learner, nor what performance gain can be expected when different learning algorithms are used to form a bagging predictor. In this paper, we carry out comprehensive empirical studies to evaluate bagging predictors, using 12 different learning algorithms and 48 benchmark data-sets. Our analysis uses robustness and stability decompositions to characterize different learning algorithms, through which we rank all learning algorithms and comparatively study their bagging predictors to draw conclusions. Our studies assert that both stability and robustness are key requirements for building a high-performance bagging predictor. In addition, our studies demonstrate that bagging is statistically superior to most single base learners, except for k-nearest neighbors (KNN) and naïve Bayes (NB). Multi-layer perceptron (MLP), naïve Bayes trees (NBTree), and PART are the learning algorithms with the best bagging performance.
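As a rough illustration of the comparison the abstract describes, the sketch below, assuming scikit-learn, evaluates a few base learners both alone and inside a bagging ensemble so the per-algorithm gain can be read off directly. The three learners shown are stand-ins for the paper's full roster of 12 algorithms, and the data set and fold counts are illustrative choices, not the paper's setup.

```python
# Sketch: the same base learners evaluated as single predictors and as
# bagging ensembles, reporting the per-algorithm performance gain.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Stand-ins for the paper's 12 algorithms (which also include MLP,
# NBTree, PART, and others).
base_learners = {
    "DecisionTree": DecisionTreeClassifier(random_state=0),  # unstable
    "NaiveBayes": GaussianNB(),                              # stable
    "KNN": KNeighborsClassifier(),                           # stable
}

for name, learner in base_learners.items():
    single = cross_val_score(learner, X, y, cv=10).mean()
    bagged = cross_val_score(
        BaggingClassifier(learner, n_estimators=50, random_state=0),
        X, y, cv=10,
    ).mean()
    print(f"{name:12s} single={single:.3f} bagged={bagged:.3f} "
          f"gain={bagged - single:+.3f}")
```

Under the abstract's findings, one would expect the decision tree to show a clear gain from bagging while the stable KNN and NB learners show little or none.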